moorabbit
Contributor

The following AVX[512] intrinsics are now constexpr:

  • _mm512_mask_cvtepi8_epi32
  • _mm512_maskz_cvtepi8_epi32
  • _mm512_mask_cvtepi8_epi64
  • _mm512_maskz_cvtepi8_epi64
  • _mm512_mask_cvtepi16_epi32
  • _mm512_maskz_cvtepi16_epi32
  • _mm512_mask_cvtepi16_epi64
  • _mm512_maskz_cvtepi16_epi64
  • _mm512_mask_cvtepi32_epi64
  • _mm512_maskz_cvtepi32_epi64
  • _mm512_mask_cvtepu8_epi32
  • _mm512_maskz_cvtepu8_epi32
  • _mm512_mask_cvtepu8_epi64
  • _mm512_maskz_cvtepu8_epi64
  • _mm512_mask_cvtepu16_epi32
  • _mm512_maskz_cvtepu16_epi32
  • _mm512_mask_cvtepu16_epi64
  • _mm512_maskz_cvtepu16_epi64
  • _mm512_mask_cvtepu32_epi64
  • _mm512_maskz_cvtepu32_epi64
  • _mm512_mask_cvtepi8_epi16
  • _mm512_maskz_cvtepi8_epi16
  • _mm512_mask_cvtepu8_epi16
  • _mm512_maskz_cvtepu8_epi16
  • _mm_cvtepi16_epi8
  • _mm256_cvtepi16_epi8
  • _mm256_mask_cvtepi16_epi8
  • _mm256_maskz_cvtepi16_epi8
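For readers unfamiliar with the mask/maskz naming, here is a minimal scalar sketch of what the merge-masking and zero-masking forms of an extension such as _mm512_mask_cvtepi8_epi32 compute per lane. It is a model only and does not call the real intrinsics; the types `V16i8`/`V16i32` and the `model_*` function names are hypothetical stand-ins, not part of the headers touched by this PR.

```cpp
#include <cstdint>

// Hypothetical scalar stand-ins for __m128i (16 x i8) and __m512i (16 x i32).
struct V16i8  { std::int8_t  lane[16]; };
struct V16i32 { std::int32_t lane[16]; };

// Merge-masking model: lane i becomes the sign-extended source byte when
// bit i of the mask is set, otherwise it keeps the value from `passthru`.
constexpr V16i32 model_mask_cvtepi8_epi32(V16i32 passthru, std::uint16_t mask,
                                          V16i8 a) {
  V16i32 r = passthru;
  for (int i = 0; i < 16; ++i)
    if ((mask >> i) & 1u)
      r.lane[i] = static_cast<std::int32_t>(a.lane[i]);
  return r;
}

// Zero-masking model: same as above, but unselected lanes are zero.
constexpr V16i32 model_maskz_cvtepi8_epi32(std::uint16_t mask, V16i8 a) {
  return model_mask_cvtepi8_epi32(V16i32{}, mask, a);
}

// Compile-time usage: mask 0x5 selects lanes 0 and 2 only.
constexpr V16i8  kSrc = {{-1, 2, -3, 4}};          // remaining lanes are zero
constexpr V16i32 kOld = {{100, 200, 300, 400}};    // remaining lanes are zero
constexpr V16i32 kMerged = model_mask_cvtepi8_epi32(kOld, 0x5, kSrc);
constexpr V16i32 kZeroed = model_maskz_cvtepi8_epi32(0x5, kSrc);
static_assert(kMerged.lane[0] == -1 && kMerged.lane[1] == 200, "merge-masking");
static_assert(kZeroed.lane[1] == 0 && kZeroed.lane[2] == -3, "zero-masking");
```

Being able to write `static_assert`s like these against the real intrinsics is the practical effect of marking them constexpr.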

This PR is part 1 of a series of PRs fixing #154539

@moorabbit
Contributor Author

@RKSimon these functions:

static __inline__ __m128i __DEFAULT_FN_ATTRS128
_mm_mask_cvtepi16_epi8 (__m128i __O, __mmask8 __M, __m128i __A) {
  return (__m128i) __builtin_ia32_pmovwb128_mask ((__v8hi) __A,
                                                  (__v16qi) __O,
                                                  __M);
}

static __inline__ __m128i __DEFAULT_FN_ATTRS128
_mm_maskz_cvtepi16_epi8 (__mmask8 __M, __m128i __A) {
  return (__m128i) __builtin_ia32_pmovwb128_mask ((__v8hi) __A,
                                                  (__v16qi) _mm_setzero_si128(),
                                                  __M);
}

Can't be made constexpr because __builtin_ia32_pmovwb128_mask is not constexpr.

Is there a plan to remove __builtin_ia32_pmovwb128_mask and replace it with a __builtin_ia32_selectb_128? Following the approach here:

static __inline__ __m128i __DEFAULT_FN_ATTRS256
_mm256_mask_cvtepi16_epi8 (__m128i __O, __mmask16 __M, __m256i __A) {
  return (__m128i)__builtin_ia32_selectb_128((__mmask16)__M,
                                             (__v16qi)_mm256_cvtepi16_epi8(__A),
                                             (__v16qi)__O);
}
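The select-based shape of _mm256_mask_cvtepi16_epi8 above can be sketched in portable constexpr C++ as "truncate every lane, then pick per lane between the truncated value and the passthrough". This is a scalar model under assumed semantics, not the builtin itself; `V16i16`, `V16i8` and the `model_*` names are hypothetical.

```cpp
#include <cstdint>

// Hypothetical scalar stand-ins for __m256i (16 x i16) and __m128i (16 x i8).
struct V16i16 { std::int16_t lane[16]; };
struct V16i8  { std::int8_t  lane[16]; };

// Truncation model: keep the low 8 bits of each 16-bit lane.
constexpr V16i8 model_trunc16(V16i16 a) {
  V16i8 r{};
  for (int i = 0; i < 16; ++i)
    r.lane[i] = static_cast<std::int8_t>(
        static_cast<std::uint8_t>(a.lane[i] & 0xFF));
  return r;
}

// Byte-select model: lane i comes from `a` if mask bit i is set, else from `b`.
constexpr V16i8 model_selectb_128(std::uint16_t mask, V16i8 a, V16i8 b) {
  V16i8 r{};
  for (int i = 0; i < 16; ++i)
    r.lane[i] = ((mask >> i) & 1u) ? a.lane[i] : b.lane[i];
  return r;
}

// Model of the masked 256-bit truncation: one mask bit per result byte.
constexpr V16i8 model_mask_cvtepi16_epi8_256(V16i8 passthru, std::uint16_t mask,
                                             V16i16 a) {
  return model_selectb_128(mask, model_trunc16(a), passthru);
}

constexpr V16i16 kWide = {{0x1234, -2, 384}};   // remaining lanes are zero
constexpr V16i8  kPass = {{7, 7, 7, 7}};        // remaining lanes are zero
constexpr V16i8  kOut  = model_mask_cvtepi16_epi8_256(kPass, 0x5, kWide);
static_assert(kOut.lane[0] == 0x34 && kOut.lane[1] == 7, "lanes 0 and 1");
static_assert(kOut.lane[2] == -128 && kOut.lane[3] == 7, "lanes 2 and 3");
```

The model works cleanly here because the 16-bit mask has exactly one bit for each of the 16 result bytes.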

@RKSimon RKSimon self-requested a review September 15, 2025 16:06
@RKSimon
Collaborator

RKSimon commented Sep 18, 2025

At the moment we need to keep _mm_mask_cvtepi16_epi8 etc. as they are, because the selection mask only acts on part of the vector: we're truncating v8i16 to v8i8, applying the mask8 to that, and passing through the whole upper half of the passthrough to create a v16i8. The best we might be able to do is a custom __builtin_ia32_selectb_64, but it'll be messy. We'll have to come back to it later, maybe when the _mm512_kunpack intrinsics are constexpr?
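To make the lane-count mismatch concrete, here is a rough scalar sketch of what an 8-lane byte select could look like. No `__builtin_ia32_selectb_64` exists according to the comment above, so `model_selectb_64` and every other name and type here is hypothetical. The sketch deliberately takes the upper eight bytes of the 128-bit result as a caller-supplied value rather than asserting how the real instruction fills them.

```cpp
#include <cstdint>

// Hypothetical scalar stand-ins for v8i16, v8i8 and v16i8.
struct V8i16 { std::int16_t lane[8]; };
struct V8i8  { std::int8_t  lane[8]; };
struct V16i8 { std::int8_t  lane[16]; };

// Truncation model: keep the low 8 bits of each of the eight 16-bit lanes.
constexpr V8i8 model_trunc8(V8i16 a) {
  V8i8 r{};
  for (int i = 0; i < 8; ++i)
    r.lane[i] = static_cast<std::int8_t>(
        static_cast<std::uint8_t>(a.lane[i] & 0xFF));
  return r;
}

// Hypothetical 8-lane byte select: the 8-bit mask covers exactly these lanes.
constexpr V8i8 model_selectb_64(std::uint8_t mask, V8i8 a, V8i8 b) {
  V8i8 r{};
  for (int i = 0; i < 8; ++i)
    r.lane[i] = ((mask >> i) & 1u) ? a.lane[i] : b.lane[i];
  return r;
}

// Widen to 16 bytes; lanes 8..15 have no mask bits, so they are supplied
// separately by the caller (an assumption of this sketch).
constexpr V16i8 model_concat(V8i8 lo, V8i8 hi) {
  V16i8 r{};
  for (int i = 0; i < 8; ++i) {
    r.lane[i] = lo.lane[i];
    r.lane[i + 8] = hi.lane[i];
  }
  return r;
}

constexpr V8i16 kSrc  = {{0x0102, -1, 0x7F80}};  // remaining lanes are zero
constexpr V8i8  kPass = {{9, 9, 9}};             // remaining lanes are zero
constexpr V8i8  kLow  = model_selectb_64(0x06, model_trunc8(kSrc), kPass);
static_assert(kLow.lane[0] == 9 && kLow.lane[1] == -1, "lanes 0 and 1");
static_assert(kLow.lane[2] == -128, "lane 2");
```

The extra concatenation step is where the 128-bit case differs from the 256-bit one, whose 16-bit mask covers all 16 result bytes directly.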

Collaborator

@RKSimon RKSimon left a comment

LGTM - cheers

@RKSimon RKSimon changed the title from "[Headers][X86] Add constexpr support for some AVX[512] intrinsics." to "[Headers][X86] Add constexpr support for some AVX512 masked extension/truncation intrinsics." Sep 18, 2025
@RKSimon RKSimon enabled auto-merge (squash) September 18, 2025 10:43
@RKSimon RKSimon merged commit 226b0a9 into llvm:main Sep 18, 2025
9 checks passed
2 participants